Overview

Dataset statistics

Number of variables: 13
Number of observations: 2956
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 300.3 KiB
Average record size in memory: 104.0 B
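
The memory figures are consistent with 13 numeric columns stored at 8 bytes per value. A minimal sketch of that arithmetic follows; the 8-byte width is an assumption (the report does not state the column dtypes), as is 1 KiB = 1024 B.

```python
# Figures copied from the dataset statistics above.
n_variables = 13
n_observations = 2956
total_kib = 300.3

# Assumption: every column is a 64-bit (8-byte) numeric type.
BYTES_PER_VALUE = 8

bytes_per_record = n_variables * BYTES_PER_VALUE       # expected record size
avg_record_bytes = total_kib * 1024 / n_observations   # derived from the reported total

print(f"expected bytes per record: {bytes_per_record}")
print(f"reported total / rows:     {avg_record_bytes:.1f} B")
```

Both numbers land on roughly 104 B, which matches the reported average record size.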

Variable types

Numeric: 13

Warnings

monetary is highly correlated with qtde_invoices and 1 other field [High correlation]
qtde_invoices is highly correlated with monetary and 2 other fields [High correlation]
qtde_items is highly correlated with monetary and 1 other field [High correlation]
qtde_products is highly correlated with qtde_invoices [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
qtde_returns is highly correlated with avg_ticket [High correlation]
avg_basket_size is highly correlated with avg_ticket [High correlation]
monetary is highly correlated with qtde_invoices and 3 other fields [High correlation]
recency_days is highly correlated with qtde_invoices [High correlation]
qtde_invoices is highly correlated with monetary and 3 other fields [High correlation]
qtde_items is highly correlated with monetary and 3 other fields [High correlation]
qtde_products is highly correlated with monetary and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with monetary and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qtde_products and 1 other field [High correlation]
monetary is highly correlated with qtde_items and 1 other field [High correlation]
qtde_invoices is highly correlated with qtde_items [High correlation]
qtde_items is highly correlated with monetary and 3 other fields [High correlation]
qtde_products is highly correlated with monetary and 1 other field [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with qtde_items [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
monetary is highly correlated with qtde_returns and 4 other fields [High correlation]
qtde_returns is highly correlated with avg_ticket and 5 other fields [High correlation]
avg_unique_basket_size is highly correlated with avg_basket_size [High correlation]
avg_basket_size is highly correlated with avg_ticket and 4 other fields [High correlation]
qtde_products is highly correlated with monetary and 3 other fields [High correlation]
qtde_invoices is highly correlated with monetary and 3 other fields [High correlation]
qtde_items is highly correlated with monetary and 4 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 25.0178421) [Skewed]
frequency is highly skewed (γ1 = 25.05874785) [Skewed]
qtde_returns is highly skewed (γ1 = 23.49789957) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 33 (1.1%) zeros [Zeros]
qtde_returns has 1480 (50.1%) zeros [Zeros]
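
The Skewed and Zeros warnings can be cross-checked against the per-variable statistics. The sketch below assumes a skewness cut-off of 20. That value is an assumption, chosen only because it separates the three flagged variables from the unflagged ones in this report; the actual setting would be in the report configuration (config.json).

```python
# Skewness values copied from the descriptive statistics of each variable.
skewness = {
    "monetary": 17.73829894,
    "qtde_invoices": 10.67296559,
    "qtde_items": 19.68192671,
    "qtde_products": 15.86797052,
    "avg_ticket": 25.0178421,
    "frequency": 25.05874785,
    "qtde_returns": 23.49789957,
}

# Assumption: a variable is flagged when |skewness| exceeds this cut-off.
SKEW_THRESHOLD = 20

flagged = sorted(name for name, g1 in skewness.items() if abs(g1) > SKEW_THRESHOLD)
print(flagged)

# Zero percentages: zero count divided by the number of observations.
n_observations = 2956
zeros = {"recency_days": 33, "qtde_returns": 1480}
zero_pct = {name: round(100 * count / n_observations, 1) for name, count in zeros.items()}
print(zero_pct)
```

Under that assumed cut-off, monetary and qtde_items are strongly skewed but fall just short of being flagged.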

Reproduction

Analysis started: 2021-08-18 15:56:07.854033
Analysis finished: 2021-08-18 15:56:45.393421
Duration: 37.54 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
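
The reported duration is the difference between the two timestamps. A short sketch of the check, using only the values shown above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-08-18 15:56:07.854033", FMT)
finished = datetime.strptime("2021-08-18 15:56:45.393421", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```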

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2956
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2312.443505
Minimum: 0
Maximum: 5701
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 184.75
Q1: 924.5
Median: 2116.5
Q3: 3531.75
95-th percentile: 5025.5
Maximum: 5701
Range: 5701
Interquartile range (IQR): 2607.25
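
Range and IQR are derived from the other quantile statistics rather than measured separately. A minimal sketch using the df_index values above:

```python
# Quantile statistics for df_index, copied from the report.
q = {"min": 0, "q1": 924.5, "q3": 3531.75, "max": 5701}

value_range = q["max"] - q["min"]  # Range = Maximum - Minimum
iqr = q["q3"] - q["q1"]            # IQR = Q3 - Q1

print(value_range, iqr)
```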

Descriptive statistics

Standard deviation: 1552.846475
Coefficient of variation (CV): 0.6715175839
Kurtosis: -1.0154754
Mean: 2312.443505
Median Absolute Deviation (MAD): 1269
Skewness: 0.3413956713
Sum: 6835583
Variance: 2411332.176
Monotonicity: Strictly increasing
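
Several of these descriptive statistics follow from the mean, the standard deviation and the number of observations. The sketch below recomputes them from the reported df_index values; small differences in the last digits are expected because the inputs are already rounded.

```python
n = 2956
mean = 2312.443505
std = 1552.846475

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance is the squared standard deviation
total = mean * n     # sum of all values

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.1f}")
print(f"Sum      = {total:.0f}")
```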
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
3005 | 1 | < 0.1%
2990 | 1 | < 0.1%
2993 | 1 | < 0.1%
2994 | 1 | < 0.1%
2995 | 1 | < 0.1%
2996 | 1 | < 0.1%
2999 | 1 | < 0.1%
3001 | 1 | < 0.1%
3002 | 1 | < 0.1%
Other values (2946) | 2946 | 99.7%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
5701 | 1 | < 0.1%
5682 | 1 | < 0.1%
5672 | 1 | < 0.1%
5666 | 1 | < 0.1%
5645 | 1 | < 0.1%
5641 | 1 | < 0.1%
5635 | 1 | < 0.1%
5624 | 1 | < 0.1%
5623 | 1 | < 0.1%
5613 | 1 | < 0.1%
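
The Frequency (%) column in these value tables is the count divided by the number of observations. The sketch below reproduces the labels; the "< 0.1%" rule is an assumption inferred from the tables (a count of 1 shows as "< 0.1%", a count of 2 as "0.1%"), and `frequency_label` is a hypothetical helper, not part of pandas-profiling.

```python
n_observations = 2956

def frequency_label(count: int, n: int) -> str:
    """Format count / n as a percentage label (assumed display rule)."""
    pct = 100 * count / n
    # Assumption: shares that would round to 0.0% are shown as "< 0.1%".
    return "< 0.1%" if 0 < pct < 0.05 else f"{pct:.1f}%"

# "Other values" is whatever remains after the ten listed rows.
top_counts = [1] * 10
other_count = n_observations - sum(top_counts)

print(frequency_label(1, n_observations))
print(other_count, frequency_label(other_count, n_observations))
```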

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2956
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.85555
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12618.25
Q1: 13801.25
Median: 15222
Q3: 16767.25
95-th percentile: 17964.25
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2966

Descriptive statistics

Standard deviation: 1717.660762
Coefficient of variation (CV): 0.112479668
Kurtosis: -1.203433731
Mean: 15270.85555
Median Absolute Deviation (MAD): 1486.5
Skewness: 0.03069983665
Sum: 45140649
Variance: 2950358.492
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
17850 | 1 | < 0.1%
17588 | 1 | < 0.1%
14905 | 1 | < 0.1%
16103 | 1 | < 0.1%
14626 | 1 | < 0.1%
14868 | 1 | < 0.1%
18246 | 1 | < 0.1%
17115 | 1 | < 0.1%
16611 | 1 | < 0.1%
15912 | 1 | < 0.1%
Other values (2946) | 2946 | 99.7%

Minimum 10 values
Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%
18274 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18269 | 1 | < 0.1%

monetary
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2942
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2645.419838
Minimum: 6.2
Maximum: 272345.66
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 226.315
Q1: 556.4725
Median: 1064.525
Q3: 2245.6675
95-th percentile: 7087.02
Maximum: 272345.66
Range: 272339.46
Interquartile range (IQR): 1689.195

Descriptive statistics

Standard deviation: 9989.458288
Coefficient of variation (CV): 3.776133431
Kurtosis: 401.0034374
Mean: 2645.419838
Median Absolute Deviation (MAD): 652.88
Skewness: 17.73829894
Sum: 7819861.04
Variance: 99789276.88
Monotonicity: Not monotonic
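
Skewness and kurtosis of this size indicate a long right tail driven by a few very large values. The sketch below shows one common way to compute the two statistics, the bias-corrected sample estimators (G1 and G2, with kurtosis reported as excess kurtosis). That the report uses exactly these estimators is an assumption, and the toy data is hypothetical.

```python
import math

def sample_skewness(xs):
    """Adjusted Fisher-Pearson skewness (G1)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

def sample_excess_kurtosis(xs):
    """Bias-corrected excess kurtosis (G2); close to 0 for normal data."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    g2 = m4 / m2 ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

# Hypothetical toy data: mostly small amounts plus one very large one,
# loosely mimicking the shape of monetary.
toy = [10, 12, 15, 18, 20, 25, 30, 1000]
print(sample_skewness(toy), sample_excess_kurtosis(toy))
```

A single extreme value is enough to push both statistics well above zero, which is the pattern the monetary histogram and its maximum values suggest.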
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
951.54 | 2 | 0.1%
1736.14 | 2 | 0.1%
1133.25 | 2 | 0.1%
695.42 | 2 | 0.1%
308.32 | 2 | 0.1%
490.22 | 2 | 0.1%
296.55 | 2 | 0.1%
1682.8 | 2 | 0.1%
175.92 | 2 | 0.1%
740.95 | 2 | 0.1%
Other values (2932) | 2936 | 99.3%

Minimum 10 values
Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.06 | 1 | < 0.1%
43.08 | 1 | < 0.1%
45 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
61.65 | 1 | < 0.1%
67.67 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
272345.66 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
136626.96 | 1 | < 0.1%
120193.93 | 1 | < 0.1%
115727.35 | 1 | < 0.1%
87716.48 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
62078.12 | 1 | < 0.1%
60402.22 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.26488498
Minimum: 0
Maximum: 373
Zeros: 33
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
Median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.81710432
Coefficient of variation (CV): 1.210880629
Kurtosis: 2.780282763
Mean: 64.26488498
Median Absolute Deviation (MAD): 26
Skewness: 1.799627449
Sum: 189967
Variance: 6055.501724
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
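Common values
Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
2 | 85 | 2.9%
3 | 85 | 2.9%
8 | 76 | 2.6%
10 | 67 | 2.3%
9 | 66 | 2.2%
7 | 66 | 2.2%
17 | 64 | 2.2%
22 | 55 | 1.9%
Other values (262) | 2206 | 74.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 33 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.9%
3 | 85 | 2.9%
4 | 87 | 2.9%
5 | 43 | 1.5%
7 | 66 | 2.2%
8 | 76 | 2.6%
9 | 66 | 2.2%
10 | 67 | 2.3%

Maximum 10 values
Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%
365 | 2 | 0.1%
364 | 1 | < 0.1%
360 | 1 | < 0.1%
359 | 1 | < 0.1%
358 | 4 | 0.1%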

qtde_invoices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 54
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.720906631
Minimum: 1
Maximum: 202
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 17
Maximum: 202
Range: 201
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.818852824
Coefficient of variation (CV): 1.541513154
Kurtosis: 187.4827702
Mean: 5.720906631
Median Absolute Deviation (MAD): 2
Skewness: 10.67296559
Sum: 16911
Variance: 77.77216513
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
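Common values
Value | Count | Frequency (%)
2 | 783 | 26.5%
3 | 498 | 16.8%
4 | 387 | 13.1%
5 | 237 | 8.0%
1 | 188 | 6.4%
6 | 175 | 5.9%
7 | 136 | 4.6%
8 | 99 | 3.3%
9 | 67 | 2.3%
10 | 55 | 1.9%
Other values (44) | 331 | 11.2%

Minimum 10 values
Value | Count | Frequency (%)
1 | 188 | 6.4%
2 | 783 | 26.5%
3 | 498 | 16.8%
4 | 387 | 13.1%
5 | 237 | 8.0%
6 | 175 | 5.9%
7 | 136 | 4.6%
8 | 99 | 3.3%
9 | 67 | 2.3%
10 | 55 | 1.9%

Maximum 10 values
Value | Count | Frequency (%)
202 | 1 | < 0.1%
199 | 1 | < 0.1%
123 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
85 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%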

qtde_items
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1577
Distinct (%): 53.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1381.422869
Minimum: 1
Maximum: 177148
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 93
Q1: 267
Median: 554.5
Q3: 1211.75
95-th percentile: 3862.5
Maximum: 177148
Range: 177147
Interquartile range (IQR): 944.75

Descriptive statistics

Standard deviation: 4996.234518
Coefficient of variation (CV): 3.616730714
Kurtosis: 570.0613135
Mean: 1381.422869
Median Absolute Deviation (MAD): 360.5
Skewness: 19.68192671
Sum: 4083486
Variance: 24962359.36
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
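Common values
Value | Count | Frequency (%)
88 | 9 | 0.3%
150 | 9 | 0.3%
260 | 9 | 0.3%
200 | 9 | 0.3%
240 | 9 | 0.3%
306 | 8 | 0.3%
84 | 8 | 0.3%
272 | 8 | 0.3%
246 | 8 | 0.3%
360 | 8 | 0.3%
Other values (1567) | 2871 | 97.1%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
12 | 2 | 0.1%
16 | 3 | 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
20 | 1 | < 0.1%
23 | 1 | < 0.1%
25 | 2 | 0.1%
26 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
177148 | 1 | < 0.1%
69993 | 1 | < 0.1%
66368 | 1 | < 0.1%
64493 | 1 | < 0.1%
64124 | 1 | < 0.1%
52259 | 1 | < 0.1%
52013 | 1 | < 0.1%
40207 | 1 | < 0.1%
39984 | 1 | < 0.1%
36978 | 1 | < 0.1%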

qtde_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 453
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 117.9333559
Minimum: 1
Maximum: 7599
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 28
Median: 64
Q3: 130.25
95-th percentile: 369.5
Maximum: 7599
Range: 7598
Interquartile range (IQR): 102.25

Descriptive statistics

Standard deviation: 259.4387994
Coefficient of variation (CV): 2.199876341
Kurtosis: 362.2230087
Mean: 117.9333559
Median Absolute Deviation (MAD): 42
Skewness: 15.86797052
Sum: 348611
Variance: 67308.49065
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
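Common values
Value | Count | Frequency (%)
28 | 51 | 1.7%
15 | 36 | 1.2%
35 | 35 | 1.2%
11 | 33 | 1.1%
27 | 33 | 1.1%
24 | 32 | 1.1%
20 | 32 | 1.1%
19 | 32 | 1.1%
25 | 32 | 1.1%
29 | 31 | 1.0%
Other values (443) | 2609 | 88.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 5 | 0.2%
2 | 13 | 0.4%
3 | 16 | 0.5%
4 | 15 | 0.5%
5 | 26 | 0.9%
6 | 30 | 1.0%
7 | 21 | 0.7%
8 | 25 | 0.8%
9 | 23 | 0.8%
10 | 30 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
7599 | 1 | < 0.1%
5337 | 1 | < 0.1%
5095 | 1 | < 0.1%
4222 | 1 | < 0.1%
2630 | 1 | < 0.1%
2326 | 1 | < 0.1%
1905 | 1 | < 0.1%
1806 | 1 | < 0.1%
1577 | 1 | < 0.1%
1487 | 1 | < 0.1%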

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 2953
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.12601685
Minimum: 2.25375
Maximum: 4453.43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 2.25375
5-th percentile: 5.040963463
Q1: 13.45986335
Median: 18.22775
Q3: 25.28530327
95-th percentile: 87.30529872
Maximum: 4453.43
Range: 4451.17625
Interquartile range (IQR): 11.82543992

Descriptive statistics

Standard deviation: 120.096576
Coefficient of variation (CV): 3.625445721
Kurtosis: 802.2345013
Mean: 33.12601685
Median Absolute Deviation (MAD): 5.982908497
Skewness: 25.0178421
Sum: 97920.50581
Variance: 14423.18757
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
16.408 | 2 | 0.1%
15 | 2 | 0.1%
18.375 | 2 | 0.1%
18.15222222 | 1 | < 0.1%
17.07774194 | 1 | < 0.1%
20.51104167 | 1 | < 0.1%
149.025 | 1 | < 0.1%
21.75945946 | 1 | < 0.1%
12.949 | 1 | < 0.1%
13.92736842 | 1 | < 0.1%
Other values (2943) | 2943 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
2.25375 | 1 | < 0.1%
2.47505618 | 1 | < 0.1%
2.522716049 | 1 | < 0.1%
2.755294118 | 1 | < 0.1%
2.766153846 | 1 | < 0.1%
2.8167 | 1 | < 0.1%
2.825576923 | 1 | < 0.1%
2.86284153 | 1 | < 0.1%
2.875678322 | 1 | < 0.1%
2.905330661 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
4453.43 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
1102.36 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
859.44 | 1 | < 0.1%
663.2571429 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
624.4 | 1 | < 0.1%
615.75 | 1 | < 0.1%

avg_recency_days
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1254
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -67.18966855
Minimum: -366
Maximum: -1
Zeros: 0
Zeros (%): 0.0%
Negative: 2956
Negative (%): 100.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: -366
5-th percentile: -200
Q1: -85.33333333
Median: -48.5
Q3: -25.88928571
95-th percentile: -7.96875
Maximum: -1
Range: 365
Interquartile range (IQR): 59.44404762

Descriptive statistics

Standard deviation: 63.38252873
Coefficient of variation (CV): -0.9433374222
Kurtosis: 4.955966636
Mean: -67.18966855
Median Absolute Deviation (MAD): 26.35416667
Skewness: -2.072623922
Sum: -198612.6602
Variance: 4017.344948
Monotonicity: Not monotonic
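
The negative coefficient of variation is a side effect of the negative mean: every value of avg_recency_days is below zero, and the CV is the standard deviation divided by the mean. A short sketch of the arithmetic, using the reported values:

```python
mean = -67.18966855
std = 63.38252873

cv = std / mean                 # the sign follows the mean
cv_magnitude = std / abs(mean)  # dispersion relative to the size of the mean

print(f"{cv:.4f} {cv_magnitude:.4f}")
```

Only the magnitude is meaningful as a measure of relative spread here.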
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
-14 | 24 | 0.8%
-4 | 22 | 0.7%
-70 | 21 | 0.7%
-7 | 20 | 0.7%
-35 | 18 | 0.6%
-49 | 18 | 0.6%
-21 | 17 | 0.6%
-46 | 17 | 0.6%
-11 | 17 | 0.6%
-5 | 16 | 0.5%
Other values (1244) | 2766 | 93.6%

Minimum 10 values
Value | Count | Frequency (%)
-366 | 1 | < 0.1%
-365 | 1 | < 0.1%
-363 | 1 | < 0.1%
-362 | 1 | < 0.1%
-357 | 2 | 0.1%
-356 | 1 | < 0.1%
-355 | 2 | 0.1%
-352 | 1 | < 0.1%
-351 | 2 | 0.1%
-350 | 3 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
-1 | 16 | 0.5%
-1.5 | 1 | < 0.1%
-2 | 13 | 0.4%
-2.5 | 1 | < 0.1%
-2.601398601 | 1 | < 0.1%
-3 | 15 | 0.5%
-3.321428571 | 1 | < 0.1%
-3.390909091 | 1 | < 0.1%
-3.5 | 2 | 0.1%
-4 | 22 | 0.7%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1219
Distinct (%): 41.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1133185025
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008888888889
Q1: 0.01633986928
Median: 0.0259553127
Q3: 0.04956870588
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.0332288366

Descriptive statistics

Standard deviation: 0.4076538922
Coefficient of variation (CV): 3.597416866
Kurtosis: 998.6544668
Mean: 0.1133185025
Median Absolute Deviation (MAD): 0.01217802703
Skewness: 25.05874785
Sum: 334.9694934
Variance: 0.1661816958
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 196 | 6.6%
0.02777777778 | 17 | 0.6%
0.0625 | 17 | 0.6%
0.02380952381 | 16 | 0.5%
0.08333333333 | 15 | 0.5%
0.09090909091 | 15 | 0.5%
0.03448275862 | 14 | 0.5%
0.02941176471 | 14 | 0.5%
0.07692307692 | 14 | 0.5%
0.02173913043 | 13 | 0.4%
Other values (1209) | 2625 | 88.8%

Minimum 10 values
Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 5 | 0.2%
1.5 | 1 | < 0.1%
1.142857143 | 1 | < 0.1%
1 | 196 | 6.6%
0.75 | 1 | < 0.1%
0.6666666667 | 3 | 0.1%
0.5401069519 | 1 | < 0.1%
0.5335120643 | 1 | < 0.1%

qtde_returns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 206
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.32476319
Minimum: 0
Maximum: 9014
Zeros: 1480
Zeros (%): 50.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 8
95-th percentile: 87.25
Maximum: 9014
Range: 9014
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 273.6783149
Coefficient of variation (CV): 8.466521882
Kurtosis: 672.3503412
Mean: 32.32476319
Median Absolute Deviation (MAD): 0
Skewness: 23.49789957
Sum: 95552
Variance: 74899.82004
Monotonicity: Not monotonic
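
A MAD of 0 next to a standard deviation of about 274 looks contradictory, but it follows from more than half of the values being zero: the median is 0, and so is the median of the absolute deviations. The sketch below illustrates this with the unscaled median absolute deviation; that the report uses the unscaled form is an assumption, and the toy data is hypothetical.

```python
from statistics import median

def median_absolute_deviation(xs):
    """Median of the absolute deviations from the median (unscaled)."""
    xs = list(xs)
    med = median(xs)
    return median(abs(x - med) for x in xs)

# Hypothetical toy data: 6 of 10 values are zero, as with qtde_returns
# where 50.1% of the observations are zero.
mostly_zero = [0, 0, 0, 0, 0, 0, 1, 2, 8, 90]
print(median_absolute_deviation(mostly_zero))

# For contrast, a sample without a majority value.
print(median_absolute_deviation([1, 2, 3, 4, 100]))
```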
Histogram with fixed size bins (bins=50)
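Common values
Value | Count | Frequency (%)
0 | 1480 | 50.1%
1 | 169 | 5.7%
2 | 150 | 5.1%
3 | 106 | 3.6%
4 | 90 | 3.0%
6 | 78 | 2.6%
5 | 60 | 2.0%
12 | 50 | 1.7%
9 | 44 | 1.5%
7 | 43 | 1.5%
Other values (196) | 686 | 23.2%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1480 | 50.1%
1 | 169 | 5.7%
2 | 150 | 5.1%
3 | 106 | 3.6%
4 | 90 | 3.0%
5 | 60 | 2.0%
6 | 78 | 2.6%
7 | 43 | 1.5%
8 | 42 | 1.4%
9 | 44 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3217 | 1 | < 0.1%
2878 | 1 | < 0.1%
2256 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1594 | 1 | < 0.1%
1534 | 1 | < 0.1%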

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1920
Distinct (%): 65.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 205.6046041
Minimum: 1
Maximum: 6009.333333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 41.9375
Q1: 93.5
Median: 152.4166667
Q3: 243.0833333
95-th percentile: 510.6875
Maximum: 6009.333333
Range: 6008.333333
Interquartile range (IQR): 149.5833333

Descriptive statistics

Standard deviation: 251.7862728
Coefficient of variation (CV): 1.224613981
Kurtosis: 154.3071838
Mean: 205.6046041
Median Absolute Deviation (MAD): 69.85119048
Skewness: 9.478244221
Sum: 607767.2098
Variance: 63396.32716
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
100 | 13 | 0.4%
82 | 11 | 0.4%
114 | 10 | 0.3%
140 | 9 | 0.3%
48 | 9 | 0.3%
120 | 9 | 0.3%
91 | 8 | 0.3%
81 | 8 | 0.3%
88 | 8 | 0.3%
103 | 8 | 0.3%
Other values (1910) | 2863 | 96.9%

Minimum 10 values
Value | Count | Frequency (%)
1 | 2 | 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 2 | 0.1%
8.333333333 | 1 | < 0.1%
11 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
6009.333333 | 1 | < 0.1%
4282 | 1 | < 0.1%
3906 | 1 | < 0.1%
3224.65 | 1 | < 0.1%
2880 | 1 | < 0.1%
2460.388889 | 1 | < 0.1%
2441 | 1 | < 0.1%
2323.076923 | 1 | < 0.1%
1866.933333 | 1 | < 0.1%
1826.333333 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 898
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.74430953
Minimum: 0.2
Maximum: 246
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 2
Q1: 7.5
Median: 13
Q3: 21.39236111
95-th percentile: 43
Maximum: 246
Range: 245.8
Interquartile range (IQR): 13.89236111

Descriptive statistics

Standard deviation: 14.73738397
Coefficient of variation (CV): 0.8801428301
Kurtosis: 29.55696068
Mean: 16.74430953
Median Absolute Deviation (MAD): 6.392307692
Skewness: 3.458822346
Sum: 49496.17896
Variance: 217.1904864
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
11 | 52 | 1.8%
12 | 45 | 1.5%
14 | 42 | 1.4%
16 | 41 | 1.4%
5 | 38 | 1.3%
8 | 38 | 1.3%
9 | 38 | 1.3%
13 | 38 | 1.3%
17 | 37 | 1.3%
7.5 | 35 | 1.2%
Other values (888) | 2552 | 86.3%

Minimum 10 values
Value | Count | Frequency (%)
0.2 | 1 | < 0.1%
0.25 | 2 | 0.1%
0.3181818182 | 1 | < 0.1%
0.3333333333 | 6 | 0.2%
0.4 | 1 | < 0.1%
0.5 | 10 | 0.3%
0.5454545455 | 1 | < 0.1%
0.5714285714 | 1 | < 0.1%
0.6176470588 | 1 | < 0.1%
0.625 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
246 | 1 | < 0.1%
173.5 | 1 | < 0.1%
137 | 1 | < 0.1%
126 | 1 | < 0.1%
101 | 1 | < 0.1%
98.5 | 1 | < 0.1%
93.5 | 1 | < 0.1%
93 | 1 | < 0.1%
92 | 1 | < 0.1%
91.33333333 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots between variables (Matplotlib v3.4.2)]
2021-08-18T12:56:40.975412image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:41.179410image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:41.449088image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:41.656288image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:41.864901image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:42.061325image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:42.250651image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:42.437968image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:42.633308image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:42.836497image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:43.034916image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:43.231245image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:43.413614image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:43.611241image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:43.806469image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:43.990387image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-08-18T12:56:44.185900image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect r, apart from its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
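
As a rough illustration of the calculation above, the following Python sketch computes r with the standard library only. The function name `pearson_r` is hypothetical, and the sample values are the monetary and qtde_items columns of the first five rows in the Sample section; the report itself does not necessarily compute its matrix this way.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (the 1/n factors cancel in r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Illustrative values taken from the first five sample rows.
monetary = [5391.21, 3232.59, 6495.30, 938.89, 876.00]
qtde_items = [1733.0, 1390.0, 3796.0, 415.0, 80.0]

r = pearson_r(monetary, qtde_items)
# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([2.0 * m + 3.0 for m in monetary], qtde_items)
print(r, r_rescaled)
```

Five rows are far too few to say anything about the full dataset; the values only make the arithmetic concrete.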

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
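
The rank-based calculation can be sketched in Python as follows, again with the standard library only. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and giving tied values the average of their positions is an assumption (a common convention) that the text above does not specify.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))  # close to 1: the ordering is perfectly preserved
print(pearson_r(x, y))     # below 1: the relationship is not linear
```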

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
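
A minimal Python sketch of this pair-counting definition is shown below. The function name `kendall_tau` is hypothetical. It implements the formula exactly as stated (often called tau-a), where a pair tied in either variable counts as neither concordant nor discordant; statistical libraries frequently apply a tie correction instead, so their results may differ when ties are present.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(zip(xs, ys), 2))
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0: tied pair, counted as neither
    return (concordant - discordant) / len(pairs)


# Three observations give three pairs: two concordant, one discordant.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

The double loop over all pairs is quadratic in the number of observations, which is fine for a sketch but slow for large datasets.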

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

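| | df_index | customer_id | monetary | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 17850 | 5391.21 | 372.00 | 34.00 | 1733.00 | 297.00 | 18.15 | -35.50 | 17.00 | 40.00 | 50.97 | 0.62 |
| 1 | 1 | 13047 | 3232.59 | 56.00 | 9.00 | 1390.00 | 171.00 | 18.90 | -27.25 | 0.03 | 35.00 | 154.44 | 11.67 |
| 2 | 2 | 12583 | 6495.30 | 2.00 | 15.00 | 3796.00 | 221.00 | 29.39 | -23.19 | 0.04 | 50.00 | 253.07 | 7.13 |
| 3 | 3 | 13748 | 938.89 | 95.00 | 5.00 | 415.00 | 27.00 | 34.77 | -92.67 | 0.02 | 0.00 | 83.00 | 4.60 |
| 4 | 4 | 15100 | 876.00 | 333.00 | 3.00 | 80.00 | 3.00 | 292.00 | -8.60 | 0.07 | 22.00 | 26.67 | 0.33 |
| 5 | 5 | 15291 | 4498.02 | 25.00 | 14.00 | 1670.00 | 96.00 | 46.85 | -23.20 | 0.04 | 29.00 | 119.29 | 4.14 |
| 6 | 6 | 14688 | 5558.18 | 7.00 | 21.00 | 3396.00 | 316.00 | 17.59 | -18.30 | 0.06 | 399.00 | 161.71 | 6.52 |
| 7 | 7 | 17809 | 5401.20 | 16.00 | 12.00 | 2006.00 | 58.00 | 93.12 | -35.70 | 0.03 | 41.00 | 167.17 | 3.58 |
| 8 | 8 | 15311 | 60402.22 | 0.00 | 91.00 | 36978.00 | 2326.00 | 25.97 | -4.14 | 0.24 | 474.00 | 406.35 | 6.00 |
| 9 | 9 | 16098 | 1991.71 | 87.00 | 7.00 | 565.00 | 66.00 | 30.18 | -47.67 | 0.02 | 0.00 | 80.71 | 4.71 |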

Last rows

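| | df_index | customer_id | monetary | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2946 | 5613 | 17727 | 1060.25 | 15.00 | 1.00 | 645.00 | 66.00 | 16.06 | -6.00 | 1.00 | 6.00 | 645.00 | 66.00 |
| 2947 | 5623 | 17232 | 421.52 | 2.00 | 2.00 | 203.00 | 36.00 | 11.71 | -12.00 | 0.15 | 0.00 | 101.50 | 15.00 |
| 2948 | 5624 | 17468 | 137.00 | 10.00 | 2.00 | 116.00 | 5.00 | 27.40 | -4.00 | 0.40 | 0.00 | 58.00 | 2.50 |
| 2949 | 5635 | 13596 | 692.75 | 5.00 | 2.00 | 395.00 | 161.00 | 4.30 | -7.00 | 0.25 | 0.00 | 197.50 | 65.00 |
| 2950 | 5641 | 14893 | 1196.84 | 9.00 | 2.00 | 642.00 | 69.00 | 17.35 | -2.00 | 0.67 | 0.00 | 321.00 | 34.00 |
| 2951 | 5645 | 12479 | 468.64 | 11.00 | 1.00 | 358.00 | 29.00 | 16.16 | -4.00 | 1.00 | 34.00 | 358.00 | 29.00 |
| 2952 | 5666 | 14126 | 706.13 | 7.00 | 3.00 | 508.00 | 15.00 | 47.08 | -3.00 | 0.75 | 50.00 | 169.33 | 4.67 |
| 2953 | 5672 | 13521 | 1030.48 | 1.00 | 3.00 | 557.00 | 374.00 | 2.76 | -4.50 | 0.30 | 0.00 | 185.67 | 90.67 |
| 2954 | 5682 | 15060 | 281.67 | 8.00 | 3.00 | 187.00 | 100.00 | 2.82 | -1.00 | 1.50 | 0.00 | 62.33 | 22.67 |
| 2955 | 5701 | 12558 | 269.96 | 7.00 | 1.00 | 196.00 | 11.00 | 24.54 | -6.00 | 1.00 | 196.00 | 196.00 | 11.00 |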